Overview

Dataset statistics

Number of variables: 12
Number of observations: 35798
Missing cells: 61436
Missing cells (%): 14.3%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 3.3 MiB
Average record size in memory: 96.0 B
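The headline figures can be cross-checked against the per-variable sections of this report. The sketch below sums the "Missing" counts listed for each variable and derives the memory figures. It assumes every cell occupies 8 bytes (a float64 value or an object pointer), which is an assumption about the storage layout rather than something the report states.

```python
# Figures copied from this report. The 8-bytes-per-cell layout is an assumption.
n_rows, n_cols = 35798, 12

# "Missing" counts per variable; variables with no missing values are omitted.
missing_per_column = {
    "Agriculture": 62,
    "Autres transports": 25819,
    "Autres transports international": 32907,
    "Déchets": 6,
    "Energie": 1308,
    "Industrie hors-énergie": 1308,
    "Résidentiel": 6,
    "Routier": 20,
}

missing_cells = sum(missing_per_column.values())
missing_pct = 100 * missing_cells / (n_rows * n_cols)

bytes_per_record = n_cols * 8                     # assumed 8 bytes per cell
total_mib = n_rows * bytes_per_record / 2**20

print(missing_cells, round(missing_pct, 1), bytes_per_record, round(total_mib, 1))
```

Under that assumption the numbers agree with the overview: 61436 missing cells out of 12 × 35798, 96 bytes per record, and roughly 3.3 MiB in total.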

Variable types

Categorical: 2
Numeric: 10

Warnings

INSEE commune has a high cardinality: 35798 distinct values [High cardinality]
Commune has a high cardinality: 33338 distinct values [High cardinality]
Résidentiel is highly correlated with Tertiaire [High correlation]
Tertiaire is highly correlated with Résidentiel [High correlation]
Autres transports has 25819 (72.1%) missing values [Missing]
Autres transports international has 32907 (91.9%) missing values [Missing]
Energie has 1308 (3.7%) missing values [Missing]
Industrie hors-énergie has 1308 (3.7%) missing values [Missing]
Autres transports is highly skewed (γ1 = 32.95394514) [Skewed]
Autres transports international is highly skewed (γ1 = 22.76243785) [Skewed]
CO2 biomasse hors-total is highly skewed (γ1 = 25.19224145) [Skewed]
Déchets is highly skewed (γ1 = 29.28677041) [Skewed]
Energie is highly skewed (γ1 = 72.79009042) [Skewed]
Industrie hors-énergie is highly skewed (γ1 = 89.5297339) [Skewed]
INSEE commune is uniformly distributed [Uniform]
Commune is uniformly distributed [Uniform]
INSEE commune has unique values [Unique]
CO2 biomasse hors-total has unique values [Unique]
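The skewness warnings above report γ1, the third standardized moment. As a minimal sketch, the function below computes the population (biased) form, m3 / m2^1.5. The report's own values probably come from a bias-corrected estimator, so treat this as an illustration of the quantity rather than a reproduction of the exact numbers. The two sample lists are made up for the example.

```python
def skewness(values):
    """Population (biased) skewness g1 = m3 / m2**1.5."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((x - mean) ** 2 for x in values) / n   # second central moment
    m3 = sum((x - mean) ** 3 for x in values) / n   # third central moment
    return m3 / m2 ** 1.5

symmetric = [1, 2, 3, 4, 5]       # illustrative data, not from the dataset
long_tail = [1, 1, 1, 2, 100]     # one large value drags the right tail out

print(skewness(symmetric))
print(skewness(long_tail))
```

A symmetric sample gives 0, while a single very large value produces a clearly positive γ1, which is the pattern behind the warnings for the emission columns.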

Reproduction

Analysis started: 2021-02-18 21:33:48.972481
Analysis finished: 2021-02-18 21:34:03.604077
Duration: 14.63 seconds
Software version: pandas-profiling v2.10.0
Download configuration: config.yaml

Variables

INSEE commune
Categorical

HIGH CARDINALITY
UNIFORM
UNIQUE

Distinct: 35798
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 279.8 KiB

Value | Count
19227 | 1
63161 | 1
29011 | 1
47027 | 1
65408 | 1
Other values (35793) | 35793

Length

Max length: 5
Median length: 5
Mean length: 5
Min length: 5

Characters and Unicode

Total characters: 178990
Distinct characters: 12
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
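To illustrate what a "category" means here, the sketch below tallies Unicode general categories with Python's standard `unicodedata` module. The sample strings are taken from the sample rows and common values shown elsewhere in this report. The two-letter codes map to the names used in the tables: `Nd` is Decimal Number, `Lu` is Uppercase Letter, `Pd` is Dash Punctuation, `Po` is Other Punctuation, and `Zs` is Space Separator. This is only an illustration, not necessarily how the report itself computes the counts.

```python
import unicodedata
from collections import Counter

def category_counts(values):
    """Count characters per Unicode general category across all strings."""
    counts = Counter()
    for value in values:
        for ch in value:
            counts[unicodedata.category(ch)] += 1
    return counts

# Strings that appear in this report's sample rows and common values.
sample = ["01001", "L'ABERGEMENT-DE-VAREY", "LE PIN"]
counts = category_counts(sample)

for code, n in counts.most_common():
    print(code, n)
```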

Unique

Unique: 35798
Unique (%): 100.0%

Sample

1st row: 01001
2nd row: 01002
3rd row: 01004
4th row: 01005
5th row: 01006
Common values

Value | Count | Frequency (%)
19227 | 1 | < 0.1%
63161 | 1 | < 0.1%
29011 | 1 | < 0.1%
47027 | 1 | < 0.1%
65408 | 1 | < 0.1%
80814 | 1 | < 0.1%
31461 | 1 | < 0.1%
29296 | 1 | < 0.1%
62242 | 1 | < 0.1%
59072 | 1 | < 0.1%
Other values (35788) | 35788 | > 99.9%

[Figure: Histogram of lengths of the category]

Words

Value | Count | Frequency (%)
19227 | 1 | < 0.1%
80395 | 1 | < 0.1%
02188 | 1 | < 0.1%
48174 | 1 | < 0.1%
76477 | 1 | < 0.1%
63056 | 1 | < 0.1%
58302 | 1 | < 0.1%
09077 | 1 | < 0.1%
76026 | 1 | < 0.1%
63161 | 1 | < 0.1%
Other values (35788) | 35788 | > 99.9%

Most occurring characters

Value | Count | Frequency (%)
1 | 23687 | 13.2%
2 | 23145 | 12.9%
0 | 22832 | 12.8%
3 | 19023 | 10.6%
5 | 17370 | 9.7%
4 | 16956 | 9.5%
6 | 16008 | 8.9%
7 | 15274 | 8.5%
8 | 13628 | 7.6%
9 | 10707 | 6.0%
Other values (2) | 360 | 0.2%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 178630 | 99.8%
Uppercase Letter | 360 | 0.2%

Most frequent character per category

Decimal Number
Value | Count | Frequency (%)
1 | 23687 | 13.3%
2 | 23145 | 13.0%
0 | 22832 | 12.8%
3 | 19023 | 10.6%
5 | 17370 | 9.7%
4 | 16956 | 9.5%
6 | 16008 | 9.0%
7 | 15274 | 8.6%
8 | 13628 | 7.6%
9 | 10707 | 6.0%

Uppercase Letter
Value | Count | Frequency (%)
B | 236 | 65.6%
A | 124 | 34.4%

Most occurring scripts

Value | Count | Frequency (%)
Common | 178630 | 99.8%
Latin | 360 | 0.2%

Most frequent character per script

Common
Value | Count | Frequency (%)
1 | 23687 | 13.3%
2 | 23145 | 13.0%
0 | 22832 | 12.8%
3 | 19023 | 10.6%
5 | 17370 | 9.7%
4 | 16956 | 9.5%
6 | 16008 | 9.0%
7 | 15274 | 8.6%
8 | 13628 | 7.6%
9 | 10707 | 6.0%

Latin
Value | Count | Frequency (%)
B | 236 | 65.6%
A | 124 | 34.4%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 178990 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
1 | 23687 | 13.2%
2 | 23145 | 12.9%
0 | 22832 | 12.8%
3 | 19023 | 10.6%
5 | 17370 | 9.7%
4 | 16956 | 9.5%
6 | 16008 | 8.9%
7 | 15274 | 8.5%
8 | 13628 | 7.6%
9 | 10707 | 6.0%
Other values (2) | 360 | 0.2%

Commune
Categorical

HIGH CARDINALITY
UNIFORM

Distinct: 33338
Distinct (%): 93.1%
Missing: 0
Missing (%): 0.0%
Memory size: 279.8 KiB

Value | Count
SAINTE-COLOMBE | 13
SAINT-SAUVEUR | 11
LE PIN | 10
SAINT-AUBIN | 10
BEAULIEU | 10
Other values (33333) | 35744

Length

Max length: 45
Median length: 10
Mean length: 11.76677468
Min length: 1

Characters and Unicode

Total characters: 421227
Distinct characters: 41
Distinct categories: 7
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 31762
Unique (%): 88.7%
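The gap between "Distinct" (33338) and "Unique" (31762) for this variable suggests the two counts measure different things. A minimal sketch, assuming "Distinct" counts different values and "Unique" counts values that occur exactly once; the short list of names is illustrative and only reuses commune names that appear in this report.

```python
from collections import Counter

def distinct_and_unique(values):
    """Return (number of different values, number of values seen exactly once)."""
    counts = Counter(values)
    distinct = len(counts)
    unique = sum(1 for c in counts.values() if c == 1)
    return distinct, unique

# Illustrative list: two names repeat, two occur once.
names = ["SAINTE-COLOMBE", "LE PIN", "SAINTE-COLOMBE", "AMBLEON", "LE PIN", "VIGNY"]
distinct, unique = distinct_and_unique(names)
print(distinct, unique)
```

Under that reading, repeated names such as SAINTE-COLOMBE count once toward "Distinct" and not at all toward "Unique".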

Sample

1st row: L'ABERGEMENT-CLEMENCIAT
2nd row: L'ABERGEMENT-DE-VAREY
3rd row: AMBERIEU-EN-BUGEY
4th row: AMBERIEUX-EN-DOMBES
5th row: AMBLEON
Common values

Value | Count | Frequency (%)
SAINTE-COLOMBE | 13 | < 0.1%
SAINT-SAUVEUR | 11 | < 0.1%
LE PIN | 10 | < 0.1%
SAINT-AUBIN | 10 | < 0.1%
BEAULIEU | 10 | < 0.1%
SAINT-REMY | 10 | < 0.1%
SAINT-LOUP | 10 | < 0.1%
BEAUMONT | 9 | < 0.1%
SAINT-MICHEL | 9 | < 0.1%
SAINT-MARCEL | 9 | < 0.1%
Other values (33328) | 35697 | 99.7%

[Figure: Histogram of lengths of the category]

Words

Value | Count | Frequency (%)
la | 1092 | 2.9%
le | 782 | 2.0%
les | 336 | 0.9%
de | 22 | 0.1%
val | 18 | < 0.1%
en | 16 | < 0.1%
sainte-colombe | 13 | < 0.1%
saint-sauveur | 11 | < 0.1%
pin | 11 | < 0.1%
saint-remy | 10 | < 0.1%
Other values (33207) | 35906 | 94.0%

Most occurring characters

Value | Count | Frequency (%)
E | 53315 | 12.7%
A | 37271 | 8.8%
N | 31465 | 7.5%
L | 30297 | 7.2%
S | 30118 | 7.2%
R | 29546 | 7.0%
I | 28390 | 6.7%
- | 25771 | 6.1%
O | 22346 | 5.3%
U | 20779 | 4.9%
Other values (31) | 111929 | 26.6%

Most occurring categories

Value | Count | Frequency (%)
Uppercase Letter | 392121 | 93.1%
Dash Punctuation | 25771 | 6.1%
Space Separator | 2419 | 0.6%
Other Punctuation | 841 | 0.2%
Decimal Number | 63 | < 0.1%
Open Punctuation | 6 | < 0.1%
Close Punctuation | 6 | < 0.1%

Most frequent character per category

Uppercase Letter
Value | Count | Frequency (%)
E | 53315 | 13.6%
A | 37271 | 9.5%
N | 31465 | 8.0%
L | 30297 | 7.7%
S | 30118 | 7.7%
R | 29546 | 7.5%
I | 28390 | 7.2%
O | 22346 | 5.7%
U | 20779 | 5.3%
T | 19960 | 5.1%
Other values (16) | 88634 | 22.6%

Decimal Number
Value | Count | Frequency (%)
1 | 22 | 34.9%
2 | 6 | 9.5%
3 | 5 | 7.9%
4 | 5 | 7.9%
5 | 5 | 7.9%
6 | 5 | 7.9%
7 | 4 | 6.3%
8 | 4 | 6.3%
9 | 4 | 6.3%
0 | 3 | 4.8%

Other Punctuation
Value | Count | Frequency (%)
' | 841 | 100.0%

Dash Punctuation
Value | Count | Frequency (%)
- | 25771 | 100.0%

Space Separator
Value | Count | Frequency (%)
(space) | 2419 | 100.0%

Open Punctuation
Value | Count | Frequency (%)
( | 6 | 100.0%

Close Punctuation
Value | Count | Frequency (%)
) | 6 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Latin | 392121 | 93.1%
Common | 29106 | 6.9%

Most frequent character per script

Latin
Value | Count | Frequency (%)
E | 53315 | 13.6%
A | 37271 | 9.5%
N | 31465 | 8.0%
L | 30297 | 7.7%
S | 30118 | 7.7%
R | 29546 | 7.5%
I | 28390 | 7.2%
O | 22346 | 5.7%
U | 20779 | 5.3%
T | 19960 | 5.1%
Other values (16) | 88634 | 22.6%

Common
Value | Count | Frequency (%)
- | 25771 | 88.5%
(space) | 2419 | 8.3%
' | 841 | 2.9%
1 | 22 | 0.1%
2 | 6 | < 0.1%
( | 6 | < 0.1%
) | 6 | < 0.1%
3 | 5 | < 0.1%
4 | 5 | < 0.1%
5 | 5 | < 0.1%
Other values (5) | 20 | 0.1%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 421227 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
E | 53315 | 12.7%
A | 37271 | 8.8%
N | 31465 | 7.5%
L | 30297 | 7.2%
S | 30118 | 7.2%
R | 29546 | 7.0%
I | 28390 | 6.7%
- | 25771 | 6.1%
O | 22346 | 5.3%
U | 20779 | 4.9%
Other values (31) | 111929 | 26.6%

Agriculture
Real number (ℝ≥0)

Distinct: 35576
Distinct (%): 99.6%
Missing: 62
Missing (%): 0.2%
Infinite: 0
Infinite (%): 0.0%
Mean: 2459.97576
Minimum: 0.003431569
Maximum: 98949.31776
Zeros: 0
Zeros (%): 0.0%
Memory size: 279.8 KiB

Quantile statistics

Minimum: 0.003431569
5th percentile: 198.1526881
Q1: 797.6826308
Median: 1559.381286
Q3: 3007.883903
95th percentile: 7846.964087
Maximum: 98949.31776
Range: 98949.31433
Interquartile range (IQR): 2210.201272

Descriptive statistics

Standard deviation: 2926.957701
Coefficient of variation (CV): 1.189831928
Kurtosis: 60.09142074
Mean: 2459.97576
Median Absolute Deviation (MAD): 930.7634055
Skewness: 4.604079434
Sum: 87909693.75
Variance: 8567081.385
Monotonicity: Not monotonic
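Several of the statistics listed for this variable are derived from others. The sketch below recomputes them from the reported figures for Agriculture, using the usual definitions (IQR = Q3 − Q1, range = maximum − minimum, CV = standard deviation / mean, variance = standard deviation squared). The variable names are chosen for the example.

```python
import math

# Values reported above for the Agriculture variable.
q1, q3 = 797.6826308, 3007.883903
minimum, maximum = 0.003431569, 98949.31776
mean, std = 2459.97576, 2926.957701

iqr = q3 - q1                   # interquartile range
value_range = maximum - minimum
cv = std / mean                 # coefficient of variation
variance = std ** 2

print(iqr, value_range, cv, variance)

# Each derived value agrees with the report to its printed precision.
assert math.isclose(iqr, 2210.201272, rel_tol=1e-6)
assert math.isclose(cv, 1.189831928, rel_tol=1e-6)
```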
[Figure: Histogram with fixed size bins (bins=50)]
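A fixed-size-bin histogram splits the interval from the minimum to the maximum into equally wide bins and counts the values in each. The sketch below is a plain-Python illustration of that idea. The function name, the treatment of the maximum (placed in the last bin), and the sample data are assumptions for the example, not the report's implementation.

```python
def fixed_width_histogram(values, bins=50):
    """Count values in `bins` equally wide intervals spanning min..max."""
    lo, hi = min(values), max(values)
    counts = [0] * bins
    if lo == hi:                      # degenerate case: all values identical
        counts[0] = len(values)
        return counts
    width = (hi - lo) / bins
    for x in values:
        # The maximum would land at index `bins`, so clamp it into the last bin.
        index = min(int((x - lo) / width), bins - 1)
        counts[index] += 1
    return counts

data = [0, 1, 2, 98, 99, 100]         # illustrative values
counts = fixed_width_histogram(data, bins=50)
print(counts[0], counts[1], counts[49])
```

With heavily skewed data such as the emission columns in this report, almost all observations fall into the first few bins and a handful of extreme values stretch the axis.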
Common values

Value | Count | Frequency (%)
1.78188548 | 28 | 0.1%
3.30849485 | 24 | 0.1%
303.8369644 | 16 | < 0.1%
294.5763329 | 14 | < 0.1%
23.41429158 | 9 | < 0.1%
48.36102526 | 8 | < 0.1%
1000.652204 | 5 | < 0.1%
63.16232461 | 5 | < 0.1%
452.4630027 | 4 | < 0.1%
570.4404755 | 3 | < 0.1%
Other values (35566) | 35620 | 99.5%
(Missing) | 62 | 0.2%

Minimum 5 values

Value | Count | Frequency (%)
0.003431569 | 1 | < 0.1%
0.004829911 | 1 | < 0.1%
0.010140325 | 1 | < 0.1%
0.032444209 | 1 | < 0.1%
0.038848797 | 1 | < 0.1%

Maximum 5 values

Value | Count | Frequency (%)
98949.31776 | 1 | < 0.1%
70939.75763 | 1 | < 0.1%
53369.58721 | 1 | < 0.1%
42827.36037 | 1 | < 0.1%
40087.43192 | 1 | < 0.1%

Autres transports
Real number (ℝ≥0)

MISSING
SKEWED

Distinct: 9963
Distinct (%): 99.8%
Missing: 25819
Missing (%): 72.1%
Infinite: 0
Infinite (%): 0.0%
Mean: 654.9199401
Minimum: 0.000204496
Maximum: 513140.9717
Zeros: 0
Zeros (%): 0.0%
Memory size: 279.8 KiB

Quantile statistics

Minimum: 0.000204496
5th percentile: 12.28979284
Q1: 52.56041173
Median: 106.7959278
Q3: 237.3415008
95th percentile: 766.3351019
Maximum: 513140.9717
Range: 513140.9715
Interquartile range (IQR): 184.7810891

Descriptive statistics

Standard deviation: 9232.816833
Coefficient of variation (CV): 14.0976267
Kurtosis: 1364.758616
Mean: 654.9199401
Median Absolute Deviation (MAD): 69.05188221
Skewness: 32.95394514
Sum: 6535446.083
Variance: 85244906.67
Monotonicity: Not monotonic

[Figure: Histogram with fixed size bins (bins=50)]

Common values

Value | Count | Frequency (%)
133.2385454 | 2 | < 0.1%
261.2422255 | 2 | < 0.1%
44.74586658 | 2 | < 0.1%
49.09646127 | 2 | < 0.1%
62.63068772 | 2 | < 0.1%
69.18166898 | 2 | < 0.1%
111.2393401 | 2 | < 0.1%
119.5499688 | 2 | < 0.1%
83.08275351 | 2 | < 0.1%
60.98852678 | 2 | < 0.1%
Other values (9953) | 9959 | 27.8%
(Missing) | 25819 | 72.1%

Minimum 5 values

Value | Count | Frequency (%)
0.000204496 | 1 | < 0.1%
0.000678228 | 1 | < 0.1%
0.001757375 | 1 | < 0.1%
0.001966236 | 1 | < 0.1%
0.00524676 | 1 | < 0.1%

Maximum 5 values

Value | Count | Frequency (%)
513140.9717 | 1 | < 0.1%
308100.1638 | 1 | < 0.1%
271182.7586 | 1 | < 0.1%
270757.665 | 1 | < 0.1%
245375.4187 | 1 | < 0.1%

Autres transports international
Real number (ℝ≥0)

MISSING
SKEWED

Distinct: 2883
Distinct (%): 99.7%
Missing: 32907
Missing (%): 91.9%
Infinite: 0
Infinite (%): 0.0%
Mean: 7692.34496
Minimum: 0.000397295
Maximum: 3303393.666
Zeros: 0
Zeros (%): 0.0%
Memory size: 279.8 KiB

Quantile statistics

Minimum: 0.000397295
5th percentile: 2.140943147
Q1: 10.05096686
Median: 19.92434287
Q3: 32.98311078
95th percentile: 208.5806101
Maximum: 3303393.666
Range: 3303393.666
Interquartile range (IQR): 22.93214392

Descriptive statistics

Standard deviation: 113764.3095
Coefficient of variation (CV): 14.78928858
Kurtosis: 584.308405
Mean: 7692.34496
Median Absolute Deviation (MAD): 10.86458724
Skewness: 22.76243785
Sum: 22238569.28
Variance: 1.294231811 × 10^10
Monotonicity: Not monotonic

[Figure: Histogram with fixed size bins (bins=50)]

Common values

Value | Count | Frequency (%)
27.16194484 | 2 | < 0.1%
27.01176734 | 2 | < 0.1%
3303393.666 | 2 | < 0.1%
0.436627175 | 2 | < 0.1%
9.271196017 | 2 | < 0.1%
23.76109987 | 2 | < 0.1%
65.99887919 | 2 | < 0.1%
16.1772156 | 2 | < 0.1%
7.553689851 | 1 | < 0.1%
63.2073258 | 1 | < 0.1%
Other values (2873) | 2873 | 8.0%
(Missing) | 32907 | 91.9%

Minimum 5 values

Value | Count | Frequency (%)
0.000397295 | 1 | < 0.1%
0.011918849 | 1 | < 0.1%
0.038934907 | 1 | < 0.1%
0.045768381 | 1 | < 0.1%
0.068175817 | 1 | < 0.1%

Maximum 5 values

Value | Count | Frequency (%)
3303393.666 | 2 | < 0.1%
2202274.837 | 1 | < 0.1%
2109460.026 | 1 | < 0.1%
1101145.545 | 1 | < 0.1%
1101131.222 | 1 | < 0.1%

CO2 biomasse hors-total
Real number (ℝ≥0)

SKEWED
UNIQUE

Distinct: 35798
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1774.38155
Minimum: 3.758087577
Maximum: 576394.1812
Zeros: 0
Zeros (%): 0.0%
Memory size: 279.8 KiB

Quantile statistics

Minimum: 3.758087577
5th percentile: 79.00816519
Q1: 197.9511085
Median: 424.8499878
Q3: 1094.749826
95th percentile: 6276.826891
Maximum: 576394.1812
Range: 576390.4231
Interquartile range (IQR): 896.7987173

Descriptive statistics

Standard deviation: 7871.341922
Coefficient of variation (CV): 4.436104469
Kurtosis: 1174.072663
Mean: 1774.38155
Median Absolute Deviation (MAD): 285.9307994
Skewness: 25.19224145
Sum: 63519310.72
Variance: 61958023.66
Monotonicity: Not monotonic

[Figure: Histogram with fixed size bins (bins=50)]

Common values

Value | Count | Frequency (%)
183.2411592 | 1 | < 0.1%
142.6446843 | 1 | < 0.1%
3092.692008 | 1 | < 0.1%
205.9905387 | 1 | < 0.1%
200.6592781 | 1 | < 0.1%
3261.579469 | 1 | < 0.1%
660.2999911 | 1 | < 0.1%
1219.957265 | 1 | < 0.1%
1121.624203 | 1 | < 0.1%
339.675638 | 1 | < 0.1%
Other values (35788) | 35788 | > 99.9%

Minimum 5 values

Value | Count | Frequency (%)
3.758087577 | 1 | < 0.1%
7.162928197 | 1 | < 0.1%
7.608032151 | 1 | < 0.1%
7.96333366 | 1 | < 0.1%
8.260098417 | 1 | < 0.1%

Maximum 5 values

Value | Count | Frequency (%)
576394.1812 | 1 | < 0.1%
371375.0394 | 1 | < 0.1%
273607.0952 | 1 | < 0.1%
256360.5826 | 1 | < 0.1%
253079.4422 | 1 | < 0.1%

Déchets
Real number (ℝ≥0)

SKEWED

Distinct: 11016
Distinct (%): 30.8%
Missing: 6
Missing (%): < 0.1%
Infinite: 0
Infinite (%): 0.0%
Mean: 410.8063294
Minimum: 0.132243124
Maximum: 275500.3744
Zeros: 0
Zeros (%): 0.0%
Memory size: 279.8 KiB

Quantile statistics

Minimum: 0.132243124
5th percentile: 9.918234329
Q1: 25.65516613
Median: 54.7486535
Q3: 110.8209407
95th percentile: 246.5316813
Maximum: 275500.3744
Range: 275500.2422
Interquartile range (IQR): 85.16577452

Descriptive statistics

Standard deviation: 4122.472608
Coefficient of variation (CV): 10.03507569
Kurtosis: 1275.554838
Mean: 410.8063294
Median Absolute Deviation (MAD): 34.77994171
Skewness: 29.28677041
Sum: 14703580.14
Variance: 16994780.4
Monotonicity: Not monotonic

[Figure: Histogram with fixed size bins (bins=50)]

Common values

Value | Count | Frequency (%)
11.63739495 | 71 | 0.2%
16.53039055 | 70 | 0.2%
17.58833554 | 65 | 0.2%
18.77852366 | 61 | 0.2%
14.67898681 | 60 | 0.2%
12.29861057 | 60 | 0.2%
16.39814742 | 59 | 0.2%
19.43973928 | 59 | 0.2%
13.88552806 | 59 | 0.2%
20.10095491 | 58 | 0.2%
Other values (11006) | 35170 | 98.2%

Minimum 5 values

Value | Count | Frequency (%)
0.132243124 | 1 | < 0.1%
0.264486249 | 1 | < 0.1%
0.396729373 | 1 | < 0.1%
0.528972498 | 1 | < 0.1%
0.661215622 | 2 | < 0.1%

Maximum 5 values

Value | Count | Frequency (%)
275500.3744 | 1 | < 0.1%
221513.8458 | 1 | < 0.1%
187087.7231 | 1 | < 0.1%
171483.5912 | 1 | < 0.1%
170815.24 | 1 | < 0.1%

Energie
Real number (ℝ≥0)

MISSING
SKEWED

Distinct: 1453
Distinct (%): 4.2%
Missing: 1308
Missing (%): 3.7%
Infinite: 0
Infinite (%): 0.0%
Mean: 662.5698463
Minimum: 2.354557741
Maximum: 2535857.559
Zeros: 0
Zeros (%): 0.0%
Memory size: 279.8 KiB

Quantile statistics

Minimum: 2.354557741
5th percentile: 2.354557741
Q1: 2.354557741
Median: 4.709115482
Q3: 51.80027031
95th percentile: 958.3050007
Maximum: 2535857.559
Range: 2535855.204
Interquartile range (IQR): 49.44571257

Descriptive statistics

Standard deviation: 26455.71422
Coefficient of variation (CV): 39.9289439
Kurtosis: 5774.498622
Mean: 662.5698463
Median Absolute Deviation (MAD): 2.354557741
Skewness: 72.79009042
Sum: 22852034
Variance: 699904815.1
Monotonicity: Not monotonic

[Figure: Histogram with fixed size bins (bins=50)]

Common values

Value | Count | Frequency (%)
2.354557741 | 17013 | 47.5%
4.709115482 | 1354 | 3.8%
7.063673224 | 1174 | 3.3%
9.418230965 | 845 | 2.4%
11.77278871 | 764 | 2.1%
14.12734645 | 574 | 1.6%
16.48190419 | 527 | 1.5%
18.83646193 | 441 | 1.2%
21.19101967 | 411 | 1.1%
23.54557741 | 334 | 0.9%
Other values (1443) | 11053 | 30.9%
(Missing) | 1308 | 3.7%

Minimum 5 values

Value | Count | Frequency (%)
2.354557741 | 17013 | 47.5%
4.709115482 | 1354 | 3.8%
7.063673224 | 1174 | 3.3%
9.418230965 | 845 | 2.4%
11.77278871 | 764 | 2.1%

Maximum 5 values

Value | Count | Frequency (%)
2535857.559 | 1 | < 0.1%
2296710.801 | 1 | < 0.1%
1934988.247 | 1 | < 0.1%
1570236.21 | 1 | < 0.1%
1363401.934 | 1 | < 0.1%

Industrie hors-énergie
Real number (ℝ≥0)

MISSING
SKEWED

Distinct: 1889
Distinct (%): 5.5%
Missing: 1308
Missing (%): 3.7%
Infinite: 0
Infinite (%): 0.0%
Mean: 2423.127789
Minimum: 1.052998302
Maximum: 6765118.848
Zeros: 0
Zeros (%): 0.0%
Memory size: 279.8 KiB

Quantile statistics

Minimum: 1.052998302
5th percentile: 6.911213351
Q1: 6.911213351
Median: 13.8224267
Q3: 152.0466937
95th percentile: 3434.873035
Maximum: 6765118.848
Range: 6765117.795
Interquartile range (IQR): 145.1354803

Descriptive statistics

Standard deviation: 56703.73779
Coefficient of variation (CV): 23.40105134
Kurtosis: 9680.614239
Mean: 2423.127789
Median Absolute Deviation (MAD): 6.911213349
Skewness: 89.5297339
Sum: 83573677.44
Variance: 3215313880
Monotonicity: Not monotonic

[Figure: Histogram with fixed size bins (bins=50)]

Common values

Value | Count | Frequency (%)
6.911213351 | 16989 | 47.5%
13.8224267 | 1354 | 3.8%
20.73364005 | 1172 | 3.3%
27.6448534 | 842 | 2.4%
34.55606675 | 763 | 2.1%
41.4672801 | 573 | 1.6%
48.37849345 | 527 | 1.5%
55.28970681 | 441 | 1.2%
62.20092016 | 410 | 1.1%
69.11213351 | 334 | 0.9%
Other values (1879) | 11085 | 31.0%
(Missing) | 1308 | 3.7%

Minimum 5 values

Value | Count | Frequency (%)
1.052998302 | 1 | < 0.1%
1.424488779 | 1 | < 0.1%
3.599994427 | 1 | < 0.1%
3.916178318 | 1 | < 0.1%
4.649902607 | 1 | < 0.1%

Maximum 5 values

Value | Count | Frequency (%)
6765118.848 | 1 | < 0.1%
5997332.978 | 1 | < 0.1%
2380184.927 | 1 | < 0.1%
2005643.488 | 1 | < 0.1%
1875871.693 | 1 | < 0.1%

Résidentiel
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 35791
Distinct (%): > 99.9%
Missing: 6
Missing (%): < 0.1%
Infinite: 0
Infinite (%): 0.0%
Mean: 1783.677872
Minimum: 1.027266053
Maximum: 410675.902
Zeros: 0
Zeros (%): 0.0%
Memory size: 279.8 KiB

Quantile statistics

Minimum: 1.027266053
5th percentile: 34.06354299
Q1: 96.05291132
Median: 227.0911926
Q3: 749.4692931
95th percentile: 6615.577065
Maximum: 410675.902
Range: 410674.8747
Interquartile range (IQR): 653.4163818

Descriptive statistics

Standard deviation: 8915.902378
Coefficient of variation (CV): 4.998605701
Kurtosis: 507.7795831
Mean: 1783.677872
Median Absolute Deviation (MAD): 166.0310362
Skewness: 18.30322831
Sum: 63841398.38
Variance: 79493315.22
Monotonicity: Not monotonic

[Figure: Histogram with fixed size bins (bins=50)]

Common values

Value | Count | Frequency (%)
20.51474463 | 2 | < 0.1%
255.2761423 | 1 | < 0.1%
803.0017076 | 1 | < 0.1%
415.0475222 | 1 | < 0.1%
327.6775641 | 1 | < 0.1%
96.05702127 | 1 | < 0.1%
668.7796976 | 1 | < 0.1%
113.2162217 | 1 | < 0.1%
145.2669155 | 1 | < 0.1%
371.3201602 | 1 | < 0.1%
Other values (35781) | 35781 | > 99.9%
(Missing) | 6 | < 0.1%

Minimum 5 values

Value | Count | Frequency (%)
1.027266053 | 1 | < 0.1%
1.708651818 | 1 | < 0.1%
2.084472141 | 1 | < 0.1%
2.224000127 | 1 | < 0.1%
2.294000724 | 1 | < 0.1%

Maximum 5 values

Value | Count | Frequency (%)
410675.902 | 1 | < 0.1%
354259.0138 | 1 | < 0.1%
353586.4246 | 1 | < 0.1%
307642.4107 | 1 | < 0.1%
265526.454 | 1 | < 0.1%

Routier
Real number (ℝ≥0)

Distinct: 35749
Distinct (%): 99.9%
Missing: 20
Missing (%): 0.1%
Infinite: 0
Infinite (%): 0.0%
Mean: 3535.501245
Minimum: 0.555092164
Maximum: 586054.6728
Zeros: 0
Zeros (%): 0.0%
Memory size: 279.8 KiB

Quantile statistics

Minimum: 0.555092164
5th percentile: 126.1394469
Q1: 419.7004599
Median: 1070.895593
Q3: 3098.612157
95th percentile: 13930.8568
Maximum: 586054.6728
Range: 586054.1177
Interquartile range (IQR): 2678.911697

Descriptive statistics

Standard deviation: 9663.156628
Coefficient of variation (CV): 2.733178681
Kurtosis: 553.1785582
Mean: 3535.501245
Median Absolute Deviation (MAD): 813.5683708
Skewness: 15.83081944
Sum: 126493163.5
Variance: 93376596.01
Monotonicity: Not monotonic

[Figure: Histogram with fixed size bins (bins=50)]

Common values

Value | Count | Frequency (%)
35324.12913 | 16 | < 0.1%
42625.84606 | 9 | < 0.1%
3.683377066 | 2 | < 0.1%
2.896997387 | 2 | < 0.1%
3.960923817 | 2 | < 0.1%
4.261598642 | 2 | < 0.1%
2.41129113 | 2 | < 0.1%
2.781352422 | 2 | < 0.1%
752.1131502 | 1 | < 0.1%
541.6091186 | 1 | < 0.1%
Other values (35739) | 35739 | 99.8%
(Missing) | 20 | 0.1%

Minimum 5 values

Value | Count | Frequency (%)
0.555092164 | 1 | < 0.1%
0.647607622 | 1 | < 0.1%
0.809509295 | 1 | < 0.1%
0.8557673 | 1 | < 0.1%
0.902024756 | 1 | < 0.1%

Maximum 5 values

Value | Count | Frequency (%)
586054.6728 | 1 | < 0.1%
352836.8643 | 1 | < 0.1%
279544.8523 | 1 | < 0.1%
265561.1836 | 1 | < 0.1%
263743.3139 | 1 | < 0.1%

Tertiaire
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 8663
Distinct (%): 24.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1105.165915
Minimum: 0
Maximum: 288175.4001
Zeros: 6
Zeros (%): < 0.1%
Memory size: 279.8 KiB

Quantile statistics

Minimum: 0
5th percentile: 33.49743424
Q1: 94.74988543
Median: 216.2977182
Q3: 576.155869
95th percentile: 3972.063794
Maximum: 288175.4001
Range: 288175.4001
Interquartile range (IQR): 481.4059836

Descriptive statistics

Standard deviation: 5164.182507
Coefficient of variation (CV): 4.672766718
Kurtosis: 617.0161261
Mean: 1105.165915
Median Absolute Deviation (MAD): 152.6525931
Skewness: 19.71232093
Sum: 39562729.44
Variance: 26668780.97
Monotonicity: Not monotonic

[Figure: Histogram with fixed size bins (bins=50)]

Common values

Value | Count | Frequency (%)
42.11106019 | 81 | 0.2%
59.81684686 | 81 | 0.2%
72.73728578 | 74 | 0.2%
63.64512506 | 74 | 0.2%
59.33831209 | 74 | 0.2%
67.95193803 | 72 | 0.2%
50.24615136 | 72 | 0.2%
53.11736001 | 71 | 0.2%
61.25245118 | 69 | 0.2%
70.34461191 | 68 | 0.2%
Other values (8653) | 35062 | 97.9%

Minimum 5 values

Value | Count | Frequency (%)
0 | 6 | < 0.1%
0.478534775 | 1 | < 0.1%
0.95706955 | 1 | < 0.1%
1.435604325 | 1 | < 0.1%
1.9141391 | 2 | < 0.1%

Maximum 5 values

Value | Count | Frequency (%)
288175.4001 | 1 | < 0.1%
179562.7614 | 1 | < 0.1%
175581.619 | 1 | < 0.1%
173447.5828 | 1 | < 0.1%
171766.4354 | 1 | < 0.1%

Interactions

[Figures: 90 pairwise interaction plots]

Correlations

[Figure: correlation heatmap]

Pearson's r

Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1, with -1 indicating total negative linear correlation, 0 indicating no linear correlation, and +1 indicating total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, which implies that for a linear relationship the steepness of the line does not affect r.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
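The calculation described above can be sketched in a few lines of plain Python. The function name and the sample lists are chosen for illustration. Population (divide-by-n) moments are used throughout, which gives the same r as sample moments because the factor cancels.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Covariance of xs and ys divided by the product of their standard deviations."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / n
    std_x = sqrt(sum((x - mean_x) ** 2 for x in xs) / n)
    std_y = sqrt(sum((y - mean_y) ** 2 for y in ys) / n)
    return cov / (std_x * std_y)

xs = [1, 2, 3, 4]
print(pearson_r(xs, [2, 4, 6, 8]))            # perfectly linear, increasing
print(pearson_r(xs, [8, 6, 4, 2]))            # perfectly linear, decreasing
print(pearson_r(xs, [3 * x + 10 for x in xs]))  # shift and rescale: r unchanged
```

The last call illustrates the invariance under changes of location and scale mentioned in the description.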
[Figure: correlation heatmap]

Spearman's ρ

Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better at catching nonlinear monotonic correlations than Pearson's r. Its value lies between -1 and +1, with -1 indicating total negative monotonic correlation, 0 indicating no monotonic correlation, and +1 indicating total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
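A minimal sketch of that recipe: replace each value by its rank (ties share the average of their positions), then apply the Pearson formula to the ranks. The helper names and sample data are illustrative.

```python
from math import sqrt

def average_ranks(values):
    """1-based ranks; tied values receive the mean of the positions they occupy."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1          # average position of the tie group
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks

def pearson_r(xs, ys):
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return cov / sqrt(sum((x - mx) ** 2 for x in xs) * sum((y - my) ** 2 for y in ys))

def spearman_rho(xs, ys):
    """Pearson correlation of the rank variables."""
    return pearson_r(average_ranks(xs), average_ranks(ys))

xs = [1, 2, 3, 4, 5]
cubes = [x ** 3 for x in xs]              # monotonic but not linear
print(spearman_rho(xs, cubes), pearson_r(xs, cubes))
```

For the cubic relationship ρ is 1 because the ordering is preserved exactly, while r stays below 1 because the relationship is not linear.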
[Figure: correlation heatmap]

Kendall's τ

Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1, with -1 indicating total negative correlation, 0 indicating no correlation, and +1 indicating total positive correlation.

To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is given by the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
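The definition above corresponds to the variant usually called tau-a. The sketch below implements it directly by enumerating all pairs. Statistical libraries often report tau-b instead, which adds a correction for ties, so results can differ when the data contain tied values. The function name and sample data are illustrative.

```python
from itertools import combinations

def kendall_tau_a(xs, ys):
    """(concordant pairs - discordant pairs) / total number of pairs."""
    concordant = discordant = 0
    pairs = list(combinations(range(len(xs)), 2))
    for i, j in pairs:
        direction = (xs[i] - xs[j]) * (ys[i] - ys[j])
        if direction > 0:        # both variables move the same way
            concordant += 1
        elif direction < 0:      # the variables move in opposite directions
            discordant += 1
        # direction == 0 means a tie; it counts toward neither group
    return (concordant - discordant) / len(pairs)

print(kendall_tau_a([1, 2, 3, 4], [1, 2, 3, 4]))
print(kendall_tau_a([1, 2, 3, 4], [4, 3, 2, 1]))
print(kendall_tau_a([1, 2, 3], [1, 3, 2]))
```

In the last call two of the three pairs are concordant and one is discordant, giving (2 − 1) / 3.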
[Figure: correlation heatmap]

Phik (φk)

Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency, and reverts to the Pearson correlation coefficient in the case of a bivariate normal input distribution. Extensive documentation is available from the Phik project.

Missing values

[Figure] A simple visualization of nullity by column.
[Figure] The nullity matrix is a data-dense display that lets you quickly pick out visual patterns in data completion.
[Figure] The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
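One way to read "nullity correlation" is as the Pearson correlation between presence indicators (1 if a value is present, 0 if missing). The sketch below follows that reading. It is an assumption about the idea, not a description of the plotting library's internals. The three short columns use the first six values from the "First rows" sample of this report, rounded, with `None` standing in for NaN.

```python
from math import sqrt

def nullity_correlation(col_a, col_b):
    """Pearson correlation of the presence indicators of two columns."""
    a = [0.0 if v is None else 1.0 for v in col_a]
    b = [0.0 if v is None else 1.0 for v in col_b]
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    var_a = sum((x - ma) ** 2 for x in a)
    var_b = sum((y - mb) ** 2 for y in b)
    if var_a == 0 or var_b == 0:          # fully present or fully missing column
        return float("nan")
    return cov / sqrt(var_a * var_b)

# First six sample rows, rounded; None marks a missing value.
energie = [2.35, 2.35, 998.33, 94.18, None, 122.44]
industrie = [6.91, 6.91, 2930.35, 276.45, None, 359.38]
autres_transports = [None, None, 212.58, None, None, 132.97]

print(nullity_correlation(energie, industrie))
print(nullity_correlation(energie, autres_transports))
```

Energie and Industrie hors-énergie are missing in the same row of this sample, so their nullity correlation is 1; both columns also report the same number of missing values (1308) in the full dataset.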
[Figure] The dendrogram lets you correlate variable completion more fully, revealing trends deeper than the pairwise ones visible in the correlation heatmap.

Sample

First rows

Row | INSEE commune | Commune | Agriculture | Autres transports | Autres transports international | CO2 biomasse hors-total | Déchets | Energie | Industrie hors-énergie | Résidentiel | Routier | Tertiaire
0 | 01001 | L'ABERGEMENT-CLEMENCIAT | 3711.425991 | NaN | NaN | 432.751835 | 101.430476 | 2.354558 | 6.911213 | 309.358195 | 793.156501 | 367.036172
1 | 01002 | L'ABERGEMENT-DE-VAREY | 475.330205 | NaN | NaN | 140.741660 | 140.675439 | 2.354558 | 6.911213 | 104.866444 | 348.997893 | 112.934207
2 | 01004 | AMBERIEU-EN-BUGEY | 499.043526 | 212.577909 | NaN | 10313.446520 | 5314.314445 | 998.332482 | 2930.354461 | 16616.822530 | 15642.420310 | 10732.376930
3 | 01005 | AMBERIEUX-EN-DOMBES | 1859.160954 | NaN | NaN | 1144.429311 | 216.217508 | 94.182310 | 276.448534 | 663.683146 | 1756.341319 | 782.404357
4 | 01006 | AMBLEON | 448.966808 | NaN | NaN | 77.033834 | 48.401549 | NaN | NaN | 43.714019 | 398.786800 | 51.681756
5 | 01007 | AMBRONAY | 4390.996698 | 132.968268 | NaN | 3067.245459 | 6144.186081 | 122.437002 | 359.383094 | 1898.638273 | 22481.123580 | 2789.133106
6 | 01008 | AMBUTRIX | 489.648726 | 60.233517 | NaN | 558.055709 | 98.388885 | 2.354558 | 6.911213 | 495.393592 | 3616.168396 | 356.029873
7 | 01009 | ANDERT-ET-CONDON | 676.468367 | NaN | NaN | 207.181968 | 149.767083 | 2.354558 | 6.911213 | 141.533662 | 696.956038 | 161.266219
8 | 01010 | ANGLEFORT | 1070.411510 | 143.815664 | NaN | 1286.621447 | 146.525382 | 341.410872 | 175185.892500 | 448.518801 | 3199.808360 | 1206.147643
9 | 01011 | APREMONT | 2010.790441 | NaN | NaN | 249.086914 | 52.103791 | 2.354558 | 6.911213 | 151.844891 | 686.153720 | 188.542701

Last rows

Row | INSEE commune | Commune | Agriculture | Autres transports | Autres transports international | CO2 biomasse hors-total | Déchets | Energie | Industrie hors-énergie | Résidentiel | Routier | Tertiaire
35788 | 95652 | VIARMES | 252.094735 | 55.168176 | NaN | 1278.247866 | 43.526914 | 40.027482 | 117.490627 | 6546.179324 | 5705.777606 | 2521.399729
35789 | 95656 | VIENNE-EN-ARTHIES | 324.827105 | NaN | NaN | 98.346155 | 57.129030 | 2.354558 | 6.911213 | 158.726900 | 305.819555 | 206.727023
35790 | 95658 | VIGNY | 1289.190261 | NaN | NaN | 1616.838995 | 2297.572718 | 313.156180 | 919.191376 | 1424.373138 | 8539.961119 | 516.817557
35791 | 95660 | VILLAINES-SOUS-BOIS | 447.132894 | 18.996282 | NaN | 207.294216 | 92.966916 | 11.772789 | 34.556067 | 1023.037832 | 912.992023 | 336.409947
35792 | 95675 | VILLERON | 1335.123315 | 4.975886 | NaN | 1625.523404 | 104.075339 | 320.219853 | 939.925016 | 617.177184 | 7740.572805 | 725.467969
35793 | 95676 | VILLERS-EN-ARTHIES | 1628.065094 | NaN | NaN | 165.045396 | 65.063617 | 11.772789 | 34.556067 | 176.098160 | 309.627908 | 235.439109
35794 | 95678 | VILLIERS-ADAM | 698.630772 | NaN | NaN | 1331.126598 | 111.480954 | 2.354558 | 6.911213 | 1395.529811 | 18759.370070 | 403.404815
35795 | 95680 | VILLIERS-LE-BEL | 107.564967 | NaN | NaN | 8367.174532 | 225.622903 | 534.484607 | 1568.845431 | 22613.830250 | 12217.122400 | 13849.512000
35796 | 95682 | VILLIERS-LE-SEC | 1090.890170 | NaN | NaN | 326.748418 | 108.969749 | 2.354558 | 6.911213 | 67.235487 | 4663.232127 | 85.657725
35797 | 95690 | WY-DIT-JOLI-VILLAGE | 1495.103542 | NaN | NaN | 125.236417 | 97.728612 | 4.709115 | 13.822427 | 117.450851 | 504.400972 | 147.867245